Patrick Altmeyer
CounterfactualExplanations.jl
“You cannot appeal to (algorithms). They do not listen. Nor do they bend.”
— Cathy O’Neil in Weapons of Math Destruction, 2016
We have fitted some black box classifier to divide cats and dogs. One 🐱 is friends with a lot of cool 🐶 and wants to remain part of that group. The counterfactual path below shows her how to fool the classifier:
\[ \min_{\tilde{x} \in \mathcal{X}} h(\tilde{x}) \ \ \ \mbox{s. t.} \ \ \ M(\tilde{x}) = t \qquad(1)\]
\[ \tilde{x} = \arg \min_{\tilde{x}} \ell(M(\tilde{x}),t) + \lambda h(\tilde{x}) \qquad(2)\]
So counterfactual search is just gradient descent in the feature space 💡 Easy, right?
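To make Equation (2) concrete, here is a minimal sketch in plain Julia that does not use the package. It assumes a logistic classifier M(x̃) = σ(w⋅x̃ + b), binary cross-entropy for ℓ, and squared Euclidean distance from the factual for h. The function name `search_counterfactual`, the step size `η`, and the stopping rule are all hypothetical choices made for illustration.

```julia
using LinearAlgebra

# Logistic sigmoid.
σ(z) = 1 / (1 + exp(-z))

# Hypothetical helper (not part of CounterfactualExplanations.jl):
# gradient descent on ℓ(M(x̃), t) + λ h(x̃) with respect to the features.
# Assumptions: M is logistic, ℓ is binary cross-entropy, h(x̃) = ‖x̃ - x‖².
function search_counterfactual(x, w, b, t; λ=0.01, η=0.1, γ=0.9, max_iter=1_000)
    x_cf = float.(copy(x))          # start the search at the factual
    path = [copy(x_cf)]
    for _ in 1:max_iter
        p = σ(dot(w, x_cf) + b)
        # Stop once the target class is predicted with confidence γ.
        confidence = t == 1 ? p : 1 - p
        confidence >= γ && break
        grad_loss = (p - t) .* w        # gradient of cross-entropy w.r.t. x̃
        grad_cost = 2 .* (x_cf .- x)    # gradient of the squared distance
        x_cf .-= η .* (grad_loss .+ λ .* grad_cost)
        push!(path, copy(x_cf))
    end
    return x_cf, path
end

w = [1.0, -2.0]
b = 0.0
x = [-1.0, 0.5]                     # factual, predicted as class 0
x_cf, path = search_counterfactual(x, w, b, 1.0)
```

With this assumed cost, λ trades validity against proximity: a larger λ keeps the counterfactual closer to the factual, but may stop the search short of the desired confidence γ.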
Effective counterfactuals should meet certain criteria ✅
\[ \tilde{x} = \arg \min_{\tilde{x}} \ell(M(\tilde{x}),t) \ \ , \ \ \forall M\in\mathcal{\widetilde{M}} \qquad(3)\]
CounterfactualExplanations.jl
Fast, transparent, beautiful 🔴🟢🟣
CounterfactualExplanations.jl is a package for generating counterfactual explanations and algorithmic recourse.
Using the package, generating counterfactuals is as easy as follows:
# Some random example:
w = [1.0 -2.0] # true coefficients
b = [0] # true constant
x = [-1,0.5] # factual in class 0
target = 1.0 # target
γ = 0.9 # desired confidence
# Declare model:
using CounterfactualExplanations
using CounterfactualExplanations.Models
𝑴 = LogisticModel(w, b)
# Counterfactual search:
generator = GenericGenerator(
    0.1, 0.1, 1e-5, :logitbinarycrossentropy, nothing)
recourse = generate_counterfactual(
    generator, x, 𝑴, target, γ)
Designed to work with any custom model and generator through multiple dispatch.
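The sketch below illustrates the multiple-dispatch idea in plain Julia, without the package. Every name in it (`AbstractModel`, `LinearLogit`, `TwoLayerNet`, `probs`, `reaches_target`) is a hypothetical stand-in and not the package's actual interface. The point is only that generic code written against one function works for any model type that adds its own method.

```julia
# Hypothetical stand-ins for illustration only; not the package's own types.
abstract type AbstractModel end

struct LinearLogit <: AbstractModel
    w::Vector{Float64}
    b::Float64
end

struct TwoLayerNet <: AbstractModel
    W1::Matrix{Float64}
    w2::Vector{Float64}
end

sigmoid(z) = 1 / (1 + exp(-z))

# One `probs` method per model type; Julia selects the method from the
# runtime type of the first argument.
probs(M::LinearLogit, x) = sigmoid(sum(M.w .* x) + M.b)
probs(M::TwoLayerNet, x) = sigmoid(sum(M.w2 .* tanh.(M.W1 * x)))

# Generic code written once against `probs`; it never inspects the model type.
reaches_target(M::AbstractModel, x, γ) = probs(M, x) >= γ

models = [LinearLogit([1.0, -2.0], 0.0),
          TwoLayerNet([1.0 0.0; 0.0 1.0], [1.0, -2.0])]
results = [reaches_target(M, [2.0, -1.0], 0.9) for M in models]
```

Supporting a new model under this design would mean defining one more struct and one more `probs` method, leaving `reaches_target` untouched.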
This looks nice 🤓
And this … ugh 🥴
What happens once algorithmic recourse (AR) has actually been implemented? 👀
Explaining black box models through counterfactuals